FIGURE 1
Blastp vs. NCBI-nr

>dbj|BAB68513.1| hatching enzyme EHE4 [Anguilla japonica]
Length = 271

Score = 197 bits (502), Expect = 1e-49

Identities = 103/233 (44%), Positives = 141/233 (60%), Gaps = 5/233 (2%) 

Query: 52 DKDIPAINQGLILEETPESSFLIEGDIIRPSPFRLLSATSNK--WPMGGSGVVEVPFLLS 109

D D I ++ S L+EGD+I + + +N+ W G+VEVP+ +S 

Sbjct: 41 DPDDVDITTSILQSTWGSSEILMEGDLIVSNTRNAMKCW^C^ 100 

Query: 110 SKYDEPSHQVILEALAEFERSTCIRFVTYQDQRDFISIIPMYGCFSSVGRSGGMQVVSLA 169

+ + + + I A+ F TCIRFV QRDFISI GC+S +GR+GG QVVSLA

Sbjct: 101 NEFSYYHKKRIENAMKTFOTETCIRFVPRSSQRDFISIESRIX5CYSYLGRTGGKQVVSLA 160 

Query: 170 PT-CLQKGRGIVLHELMHVLGFWHEHTRADRDRYIRVNWNEILPGFEINFIKSQSSNMLT 228

C+ GI+ HEL H LGF+HEHTR+DRD Y+++NW + P NF ++N+ T 
Sbjct: 161 RYGCVY--HGIIQHELNHALGFYHEHTRSDRDEYVKINWENVAPHTIYNFQTQDTNNLNT 218

Query: 229 PYDYSSVMHYGRLAFSRRGLPTITPLWAPSVHIGQRWNLSASDITRVLKLYGC 281

PYDY+S+MHYGR AFS G+ TITP+ P+ IGQR + + S DI R+ KLY C 
Sbjct: 219 PYDYTSIMHYGRTAFSTNGMDTITPVPNPNQSIGQRRSMSRGDILRIKKLYSC 271 



Tblastn vs. NCBI-est 

Tissue - Uterus tumour

>gb|BI061462.1|BI061462 IL3-UT0117-070301-494-H12 UT0117 Homo sapiens cDNA.
Length = 652

Score = 175 bits (443), Expect = 2e-42
Identities = 85/86 (98%), Positives = 85/86 (98%)
Frame = -2

Query: 29  SCAGACGTSFPDGLTPEGTQASGDKDIPAINQGLILEETPESSFLIEGDIIRPSPFRLLS 88
           SCAGACGTSFPDGLTPEGTQASGDKDIPAINQGLILEETPESSFLIEGDIIRPSPFRLLS
Sbjct: 546 SCAGACGTSFPDGLTPEGTQASGDKDIPAINQGLILEETPESSFLIEGDIIRPSPFRLLS 367

Query: 89  ATSNKWPMGGSGVVEVPFLLSSKYDE 114
           ATSNKWPMGGSGVVEVPFLLSSKY E
Sbjct: 366 ATSNKWPMGGSGVVEVPFLLSSKYGE 289
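
The identity figure reported for the EST hit can be re-derived from the aligned strings. The following is a minimal sketch, not part of the original figure: the function name `alignment_stats` is hypothetical, the two strings are the query and subject rows of the Tblastn alignment with OCR spacing removed, and the percentage is truncated to a whole number on the assumption that this matches how the report prints it.

```python
from typing import Tuple


def alignment_stats(query: str, subject: str) -> Tuple[int, int, int]:
    """Return (identities, aligned columns, gap columns) for two aligned rows."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    # A column is an identity when both rows carry the same residue (not a gap).
    identities = sum(1 for q, s in zip(query, subject) if q == s and q != "-")
    # A column is a gap column when either row carries '-'.
    gaps = sum(1 for q, s in zip(query, subject) if q == "-" or s == "-")
    return identities, len(query), gaps


# Query residues 29-114 and the translated EST (frame -2), as read from the figure.
query = (
    "SCAGACGTSFPDGLTPEGTQASGDKDIPAINQGLILEETPESSFLIEGDIIRPSPFRLLS"
    "ATSNKWPMGGSGVVEVPFLLSSKYDE"
)
subject = (
    "SCAGACGTSFPDGLTPEGTQASGDKDIPAINQGLILEETPESSFLIEGDIIRPSPFRLLS"
    "ATSNKWPMGGSGVVEVPFLLSSKYGE"
)

identities, columns, gaps = alignment_stats(query, subject)
percent = 100 * identities // columns  # truncated, not rounded (assumption)
print(f"Identities = {identities}/{columns} ({percent}%), Gaps = {gaps}/{columns}")
```

The single mismatch is the D/G pair near the end of the second block, which is why the match line shows a blank in that column.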
FIGURE 2



Library  Tissue/cell source                      Vector      Host strain   Supplier    Cat no.
1        human fetal brain                       Zap II      XL1-Blue MRF  Stratagene  936206
2        human ovary                             GT10        LE392         Clontech    HL1098a
3        human pituitary                         GT10        LE392         Clontech    HL1097a
4        human placenta                          GT11        LE392         Clontech    HL1075b
5        human testis                            GT11        LE392         Clontech    HL1010b
6        human substantia nigra                  GT10        LE392         in house
7        human fetal brain                       GT10        LE392         in house
8        human cortex brain                      GT10        LE392         in house
9        human colon                             GT10        LE392         Clontech    HL1034a
10       human fetal brain                       GT10        LE392         Clontech    HL1065a
11       human fetal lung                        GT10        LE392         Clontech    HL1072a
12       human fetal kidney                      GT10        LE392         Clontech    HL1071a
13       human fetal liver                       GT10        LE392         Clontech    HL1064a
14       human bone marrow                       GT10        LE392         Clontech    HL1058a
15       human peripheral blood monocytes        GT10        LE392         Clontech    HL1050a
16       human placenta                          GT10        LE392         in house
17       human SHSY5Y                            GT10        LE392         in house
18       human U373 cell line                    GT10        LE392         in house
19       human CFPoc-1 cell line                 Uni Zap     XL1-Blue MRF  Stratagene  936206
20       human retina                            GT10        LE392         Clontech    HL1132a
21       human urinary bladder                   GT10        LE392         in house
22       human platelets                         Uni Zap     XL1-Blue MRF  in house
23       human neuroblastoma Kan + TS            GT10        LE392         in house
24       human bronchial smooth muscle           GT10        LE392         in house
25       human bronchial smooth muscle           GT10        LE392         in house
26       human thymus                            GT10        LE392         Clontech    HL1127a
27       human spleen 5' stretch                 GT11        LE392         Clontech    HL1134b
28       human peripheral blood monocytes        GT10        LE392         Clontech    HL1050a
29       human testis                            GT10        LE392         Clontech    HL1065a
30       human fetal brain                       GT10        LE392         Clontech    HL1065a
31       human substantia nigra                  GT10        LE392         Clontech    HL1093a
32       human placenta                          GT11        LE392         Clontech    HL1075b
33       human fetal brain                       GT10        LE392         Clontech    custom
34       human placenta #59                      GT10        LE392         Clontech    HL5014a
35       human pituitary                         GT10        LE392         Clontech    HL1097a
36       human pancreas #63                      Uni Zap XR  XL1-Blue MRF  Stratagene  937208
37       human placenta #19                      GT11        LE392         Clontech    HL1008
38       human liver 5' stretch                  GT11        LE392         Clontech    HL1115b
39       human uterus                            Zap-CMV XR  XL1-Blue MRF  Stratagene  980207
40       human kidney large-insert cDNA library  TriplEx2    XL1-Blue      Clontech    HL5507u
FIGURE 3



   1 AGGTCCTTGT GGACAATAGC TATTCTTCTT GGCTCTGTCG CTTCCCTTCA CTGGGTGCAG
  61 GTGACTGTGG GGGTGTCCCC AAATGCTGCC CAGCGCTGAC ATGCTCCGCC TCTGGGATTT
                                                 mlr lwd
 121 CAATCCAGGT GGGGCCCTGA GTGACCTGGC TCTGGGGCTC AGGGGTATGG AGGAGGGGGG
     fnpg gal sdl algl rgm eeg
 181 ATATAGCTGC GCAGGAGCCT GTGGTACCAG CTTCCCAGAT GGCCTCACCC CTGAGGGAAC
     gysc aga cgt sfpd glt peg
 241 CCAGGCCTCC GGGGACAAGG ACATTCCTGC AATTAACCAA GGGCTCATCC TGGAAGAAAC
     tqas gdk dip ainq gli lee
 301 CCCAGAGAGC AGCTTCCTCA TCGAGGGGGA CATCATCCGG CCGAGTCCCT TCCGACTGCT
     tpes sfl ieg diir psp frl
 361 GTCAGCAACC AGCAACAAAT GGCCCATGGG TGGTAGTGGT GTCGTGGAGG TCCCCTTCCT
     lsat snk wpm ggsg vve vpf
 421 GCTCTCCAGC AAGTACGATG AGCCCAGCCA TCAGGTCATC CTGGAGGCTC TTGCGGAGTT
     llss kyd eps hqvi lea lae
 481 TGAACGTTCC ACGTGCATCA GGTTTGTCAC CTATCAGGAC CAGAGAGACT TCATTTCCAT
     fers tci rfv tyqd qrd fis
 541 CATCCCCATG TATGGGTGCT TCTCGAGTGT GGGGCGCAGT GGAGGGATGC AGGTGGTCTC
     iipm ygc fss vgrs ggm qvv
 601 CCTGGCGCCC ACGTGTCTCC AGAAGGGCCG GGGCATTGTC CTTCATGAGC TCATGCATGT
     slap tcl qkg rgiv lhe lmh

     CP1

 661 GCTGGGCTTC TGGCACGAGC ACACGCGGGC CGACCGGGAC CGCTATATCC GTGTCAACTG
     vlgf whe htr adrd ryi rvn
 721 GAACGAGATC CTGCCAGGCT TTGAAATCAA CTTCATCAAG TCTCAGAGCA GCAACATGCT
     wnei lpg fei nfik sqs snm
 781 GACGCCCTAT GACTACTCCT CTGTGATGCA CTATGGGAGG CTCGCCTTCA GCCGGCGTGG
     ltpy dys svm hygr laf srr

     78836-GR1-3'

 841 GCTGCCCACC ATCACACCAC TTTGGGCCCC CAGTGTCCAC ATCGGCCAGC GATGGAACCT
     glpt itp lwa psvh igq rwn
 901 GAGTGCCTCG GACATCACCC GGGTCCTCAA ACTCTACGGC TGCAGCCCAA GTGGCCCCAG
     lsas dit rvl klyg csp sgp

     CP2

     78836-GR1nest-3'

 961 GCCCCGTGGG AGAGGTGAGT GGCATGGCAG GAAGGTGACT TGAACCTGGA GAAGGCGCCT
     rprg rge whg rkvt
1021 GTGCTCTAAT GGTGTCAGGG AGGGTGACAA GGAGGGAGAT GAGGTTGCAG GGGGAGCAGG
1081 GTGAGATCAC GGGGGCTTGC CAC

Position and sense of PCR primers
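
The lowercase translation rows under the nucleotide sequence can be spot-checked by translating short stretches with the standard genetic code. This is a minimal sketch under stated assumptions: `translate` and `CODON_TABLE` are hypothetical names introduced here, the standard nuclear code is assumed, and the two test strings are copied from the sequence above (the start of the open reading frame at position 101 and the stretch running into the TGA stop in the line numbered 961).

```python
from itertools import product

# Standard genetic code in TCAG order (first, second, third base).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, to_stop: bool = True) -> str:
    """Translate a DNA string in frame 1; optionally stop at the first stop codon."""
    dna = dna.upper().replace(" ", "")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*" and to_stop:
            break
        protein.append(aa)
    return "".join(protein)


# Start of the reading frame (ATG at position 101 of the listing).
print(translate("ATGCTCCGCC TCTGGGAT").lower())
# Last codons before the in-frame TGA stop.
print(translate("AGGAAGGTGACTTGAACC").lower())
```

Translation rows in the figure are printed in lowercase one-letter code, so the output is lowercased for direct comparison.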
FIGURE 4



Primer               Name  Sequence (5'-3')
CP1                  4C5   ACC GCT ATA TCC GTG TCA A
CP2                  4C6   GCT GCA GCC GTA GAG TTT
GeneRacer 3'               GCT GTC AAC GAT ACG CTA CGT AAC G
78836-GR1-3'               AGT GTC CAC ATC GGC CAG CGA TGG AA
GeneRacer 3' nested        CGC TAC GTA ACG GCA TGA CAG TG
78836-GR1nest-3'           ATG GAA CCT GAG TGC CTC GGA CAT C
78836-FL-F           4C7   CTG TCA GCA ACC AGC AAC AA
78836-FL-R           9B2   AGC CAC AGG CTT AAT CTT CG
78836-FL2-F          9E6   TCT ACC ATG GAG GGT GTA GG
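
Primer position and sense can be checked by searching a template for the primer itself (sense) or for its reverse complement (antisense). The sketch below is illustrative only: `reverse_complement` and `locate_primer` are hypothetical helpers, and the two template fragments are the lines numbered 661 and 901 of the FIGURE 3 sequence, with the two bases that are unreadable in the scan of line 901 taken as AA from the matching stretch in FIGURES 5, 7 and 10.

```python
from typing import Optional, Tuple


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def locate_primer(template: str, primer: str) -> Optional[Tuple[int, str]]:
    """Return (1-based start on the top strand, sense) or None if there is no exact match."""
    template = template.replace(" ", "").upper()
    primer = primer.replace(" ", "").upper()
    pos = template.find(primer)
    if pos != -1:
        return pos + 1, "forward"
    pos = template.find(reverse_complement(primer))
    if pos != -1:
        return pos + 1, "reverse"
    return None


# Template lines 661 and 901 (60 nt each), 10-nt blocks as printed.
fragment_661 = "GCTGGGCTTC TGGCACGAGC ACACGCGGGC CGACCGGGAC CGCTATATCC GTGTCAACTG"
fragment_901 = "GAGTGCCTCG GACATCACCC GGGTCCTCAA ACTCTACGGC TGCAGCCCAA GTGGCCCCAG"

cp1 = "ACC GCT ATA TCC GTG TCA A"
cp2 = "GCT GCA GCC GTA GAG TTT"

for name, primer, fragment, offset in [
    ("CP1", cp1, fragment_661, 661),
    ("CP2", cp2, fragment_901, 901),
]:
    hit = locate_primer(fragment, primer)
    if hit is not None:
        start, sense = hit
        # Convert the position inside the fragment to the numbering of the listing.
        print(name, offset + start - 1, sense)
```

Under these assumptions CP1 matches the top strand and CP2 matches only as a reverse complement, which is the arrangement expected of a forward/reverse primer pair.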
FIGURE 5



   1 ATGGAACCTG AGTGCCTCGG ACATCACCCG GGTCCTCAAA CTCTACGGCT GCAGCCCAAG
     wnl sas dit rvlk lyg csp
  61 TGGCCCCAGG CCCCGTGGGA GAGGGTCCCA TGCCCACAGC ACTGGTAGGA GCCCCGCTCC
     sgpr prg rgs hahs tgr spa
 121 GGCCTCCCTA TCTCTGCAGC GGCTTTTGGA GGCACTGTCG GCGGAATCCA GGAGCCCCGA
     pasl slq rll eals aes rsp
 181 CCCCAGTGGT TCCAGTGCGG GAGGCCAGCC CGTTCCTGCA GGGCCTGGGG AGAGCCCACA
     dpsg ssa ggq pvpa gpg esp
 241 TGGGTGGGAG TCCCCTGCCC TGAAAAAGCT CAGTGCAGAG GCCTCGGCAA GGCAGCCTCA
     hgwe spa lkk lsae asa rqp
 301 GACCCTAGCT TCCTCCCCAA GATCAAGGCC TGGAGCAGGT GCCCCCGGTG TTGCTCAGGA
     qtla ssp rsr pgag apg vaq
 361 GCAGTCCTGG CTGGCCGGAG TGTCCACCAA GCCCACAGTC CCATCTTCAG AAGCAGGAAT
     eqsw lag vst kptv pss eag
 421 CCAGCCAGTC CCTGTCCAGG GAAGCCCAGC TCTGCCAGGG GGCTGTGTAC CTAGAAATCA
     iqpv pvq gsp alpg gcv prn
 481 TTTCAAGGGG ATGTCCGAAG ATTAAGCCTG TGGCTTCTGT CCCCAAGTAG GGAGGGCATC
     hfkg mse d
 541 CTCTGCCCAG TGGAGCTGGG TCGTCTACCT CTTGGCTCCT TTGGGCCACA CCACTGTCTT
 601 CCAGCCCCAA CCTACCACCC CATCTCAGAG GGCCAGGACT CTTCCCCTGT CTCTCTTCAC
 661 TGTGTTCCCC TAAGGGCTCC TAGGGCCAGG GGTTCTTCTA GCTCTGCCAC AGGGGAAGGC
 721 AGGCCTGGCT GTGCCTGCTC TTGACTTTTG CCCAGCCCTG GTGGATGCTG GGAATGGGAG
 781 GTGACATTCT CCAGGGACAG GTCCTGGAAG GGGTGGGGAA GAGGTAGGTT CCAGCCCCGC
 841 AGAACCCTGG AATCCCTCCT GTGCCTGAGG CCCTGCCCCC CAGCATGGAC TAATGGTGTC
 901 CCTACCTCTC CCTCAGGGCA GCCCTGTGGC TGGGACCCTG GGAACAGCCT CCCATCCCAC
 961 CCAACATGCC CAAGTGTGGG GGAATGTTCT ACAGCAGTGT AGCCTCCAGC CCTTCTCTCC
1021 AGGAGGCTTT GAGAGCCCAA CTTACTCCCC TGCAGAGCAG GAAGGTGGTA GGTCAAGTGT
1081 GGCCACCATT GGGGAGACGA GAAAGAAGTG GGGCCCCACC AGATTGCACA ATGGGAACCT
1141 CAGCTGGCCC CTGAACAGAG GACTCAGTTG TCTCCACCCT ACACCGCTAT TCCCTGGAGC
1201 TCAGCCAGGC GCAGCCTTGG AAGGAGAAAG GGCTGGGGTT ACCTGGCTTG TCCTCCTCCA
1261 GGAAAGCCCC CTTCCTCCTC TGCCCCAGCT CCCAGCCTGG CCTCCTCCAG GCAGGCCCTA
1321 CTCCTCTGCC CCAGCTCCGG CTTTCCCCAT GAGGTTTGTC CCAGGCATGA AGAAAGCATC
1381 CAGGGTGCCA ATGAGTGGGC CTAGGCCAGA GGCCCCTCAG TCCCCAAGGG TACTGTTTTG
1441 GTGGCCTTTC AGAGGGTCAA GGAAGCCCTG CTTGGGGTAG AAGGGGCAGG AGCCCCACAT
1501 GTTGGGGGAG GAAATAAAGT GGAGTGTGCT GTGCTGAAAA AAAAAAAAAA AAAA



taa Stop codon 

aataaa Consensus polyadenylation site 
FIGURE 6

Primer  Sequence (5'-3')
T3      ATT AAC CCT CAC TAA AGG GA
T7      TAA TAC GAC TCA CTA TAG GG
SP6     ATT TAG GTG ACA CTA TAG
FIGURE 7



     78836-FL-F

   1 CTGTCAGCAA CCAGCAACAA ATGGCCCATG GGTGGTAGTG GTGTCGTGGA GGTCCCCTTC
                                    m ggs gvv evpf
  61 CTGCTCTCCA GCAAGTACGA TGAGCCCAGC CGCCAGGTCA TCCTGGAGGC TCTTGCGGAG
     lls sky deps rqv ile alae
 121 TTTGAACGTT CCACGTGCAT CAGGTTTGTC ACCTATCAGG ACCAGAGAGA CTTCATTTCC
     fer stc irfv tyq dqr dfis
 181 ATCATCCCCA TGTATGGGTG CTTCTCGAGT GTGGGGCGCA GTGGAGGGAT GCAGGTGGTC
     iip myg cfss vgr sgg mqvv
 241 TCCCTGGCGC CCACGTGTCT CCAGAAGGGC CGGGGCATTG TCCTTCATGA GCTCATGCAT
     sla ptc lqkg rgi vlh elmh
 301 GTGCTGGGCT TCTGGCACGA GCACACGCGG GCCGACCGGG ACCGCTATAT CCGTGTCAAC
     vlg fwh ehtr adr dry irvn
 361 TGGAACGAGA TCCTGCCAGG CTTTGAAATC AACTTCATCA AGTCTCAGAG CAGCAACATG
     wne ilp gfei nfi ksq ssnm
 421 CTGACGCCCT ATGACTACTC CTCTGTGATG CACTATGGGA GGCTCGCCTT CAGCCGGCGT
     ltp ydy ssvm hyg rla fsrr
 481 GGGCTGCCCA CCATCACACC ACTTTGGGCC CCCAGTGTCC ACATCGGCCA GCGATGGAAC
     glp tit plwa psv hig qrwn
 541 CTGAGTGCCT CGGACATCAC CCGGGTCCTC AAACTCTACG GCTGCAGCCC AAGTGGCCCC
     lsa sdi trvl kly gcs psgp
 601 AGGCCCCGTG GGAGAGGGTC CCATGCCCAC AGCACTGGTA GGAGCCCCGC CCCGGCCTCC
     rpr grg shah stg rsp apas
 661 CTATCTCTGC AGCGGCTTTT GGAGGCACTG TCGGCGGAAT CCAGGAGCCC CGACCCCAGT
     lsl qrl leal sae srs pdps
 721 GGTTCCAGTG CGGGAGGCCA GCCCGTTCCT GCAGGGCCTG GGGAGAGCCC ACATGGGTGG
     gss agg qpvp agp ges phgw
 781 GAGTCCCCTG CCCTGAAAAA GCTCAGTGCA GAGGCCTCGG CAAGGCAGCC TCAGACCCTA
     esp alk klsa eas arq pqtl
 841 GCTTCCTCCC CAAGATCAAG GCCTGGAGCA GGTGCCCCCG GTGTTGCTCA GGAGCAGTCC
     ass prs rpga gap gva qeqs
 901 TGGCTGGCCG GAGTGTCCAC CAAGCCCACA GTCCCATCTT CAGAAGCAGG AATCCAGCCA
     wla gvs tkpt vps sea giqp
 961 GTCCCTGTCC AGGGAAGCCC AGCTCTGCCA GGGGGCTGTG TACCTAGAAA TCATTTCAAG
     vpv qgs palp ggc vpr nhfk
1021 GGGATGTCCG AAGATTAAGC CTGTGGCT
     gms ed

     78836-FL-R
FIGURE 8



Query= INSP005a

(336 letters)

Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF

1,247,039 sequences; 397,579,747 total letters

Searching done

                                                                     Score   E
Sequences producing significant alignments:                          (bits)  Value

ref|XP_141346.1| similar to hatching enzyme EHE7 [Anguilla japon...   416    e-115
dbj|BAB68518.1| hatching enzyme EHE13 [Anguilla japonica]             187    2e-46
dbj|BAB68515.1| hatching enzyme EHE7 [Anguilla japonica]              186    4e-46
dbj|BAB68516.1| hatching enzyme EHE10 [Anguilla japonica]             186    4e-46
dbj|BAB68513.1| hatching enzyme EHE4 [Anguilla japonica]              186    5e-46
dbj|BAB68517.1| hatching enzyme EHE12 [Anguilla japonica]             183    3e-45
dbj|BAB68514.1| hatching enzyme EHE6 [Anguilla japonica]              183    3e-45
dbj|BAB68519.1| hatching enzyme EHE14 [Anguilla japonica]             182    7e-45
pir||C48826 high choriolytic hatching proteinase (EC 3.4.24.-) H...   171    1e-41
dbj|BAA12146.1| choriolysin H [Oryzias latipes]                       171    2e-41



Top alignment to known metalloproteinase:

>dbj|BAB68518.1| hatching enzyme EHE13 [Anguilla japonica]
Length = 271

Score = 187 bits (475), Expect = 2e-46
Identities = 93/183 (50%), Positives = 124/183 (66%), Gaps = 3/183 (1%)

Query: 5   GVVEVPFLLSSKYDEPSRQVILEALAEFERSTCIRFVTYQDQRDFISIIPMYGCFSSVGR 64
           G+VEVP+ +SS++    ++ I  A+  F   TCIRFV    QRDFISI    GC+S +GR
Sbjct: 91  GLVEVPYTVSSEFSYYHKKRIENAMETFNTETCIRFVPRSSQRDFISIESRDGCYSYLGR 150

Query: 65  SGGMQVVSLAPT-CLQKGRGIVLHELMHVLGFWHEHTRADRDRYIRVNWNEILPGFEINF 123
           +GG QVVSLA   C+    GI+ HEL H LGF+HEHTR+DRD Y+++NW  + P    NF
Sbjct: 151 TGGKQVVSLARYGCVY--HGIIQHELNHALGFYHEHTRSDRDEYVKINWENVAPHTIYNF 208

Query: 124 IKSQSSNMLTPYDYSSVMHYGRLAFSRRGLPTITPLWAPSVHIGQRWNLSASDITRVLKL 183
            +  ++N+ TPYDY+S+MHYGR AFS  G+ TITP+  P+  IGQR ++S  DI R+ KL
Sbjct: 209 QEQDTNNLNTPYDYTSIMHYGRTAFSTNGMDTITPVPNPNQSIGQRRSMSKGDILRINKL 268

Query: 184 YGC 186
           Y C
Sbjct: 269 YSC 271
FIGURE 9

Molecule: pCR4 TOPO-IPAAA78836-1, 5005 bps DNA Circular

File Name: 13164.cm5, dated 24 Oct 2002

Description: Ligation of inverted 78836_F2/R8 PCR product into pCR4-TOPO linear vector

Molecule Features:

Type    Start  End   C  Name          Description
REGION  205    221      M13           rev priming site
MARKER  243             T3
REGION  262    294                    Polylinker
REGION  294    294                    TOPO cloning site
GENE    1315   308   C  IPAAA78836-1
REGION  1342   295   C                Inserted PCR product
REGION  1343   1360                   Polylinker
REGION  1343   1343                   TOPO cloning site
MARKER  1395         C  T7
REGION  1403   1418     M13
GENE    2207   3001     KanR
GENE    3205   4065     AmpR
REGION  4210   4883     pUC ori
FIGURE 10



78836-FL2-F 

1 TTCTACCATG GAGGGTGTAG GGGGTCTCTG GCCTTGGGTG CTGGGTCTGC TCTCCTTGCC
m egv ggl wpwv lgl lsl

61 AGGTGTGATC CTAGGAGCGC CCCTGGCCTC CAGCTGCGCA GGAGCCTGTG GTACCAGCTT 
pgvi lga pla ssca gac gts 

121 CCCAGATGGC CTCACCCCTG AGGGAACCCA GGCCTCCGGG GACAAGGACA TTCCTGCAAT 
fpdg ltp egt qasg dkd ipa 

181 TAACCAAGGG CTCATCCTGG AAGAAACCCC AGAGAGCAGC TTCCTCATCG AGGGGGACAT 
inqg lil eet pess fli egd 

241 CATCCGGCCG AGTCCCTTCC GACTGCTGTC AGCAACCAGC AACAAATGGC CCATGGGTGG 
iirp spf rll sats nkw pmg 

301 TAGTGGTGTC GTGGAGGTCC CCTTCCTGCT CTCCAGCAAG TACGATGAGC CCAGCCGCCA 
gsgv vev pfl lssk yde psr 

361 GGTCATCCTG GAGGCTCTTG CGGAGTTTGA ACGTTCCACG TGCATCAGGT TTGTCACCTA 
qvil eal aef erst cir fvt 

421 TCAGGACCAG AGAGACTTCA TTTCCATCAT CCCCATGTAT GGGTGCTTCT CGAGTGTGGG 
yqdq rdf isi ipmy gcf ssv 

481 GCGCAGTGGA GGGATGCAGG TGGTCTCCCT GGCGCCCACG TGTCTCCAGA AGGGCCGGGG 
grsg gmq vvs lapt clq kgr 

541 CATTGTCCTT CATGAGCTCA TGCATGTGCT GGGCTTCTGG CACGAGCACA CGCGGGCCGA 
givl hel mhv lgfw heh tra 

601 CCGGGACCGC TATATCCGTG TCAACTGGAA CGAGATCCTG CCAGGCTTTG AAATCAACTT 
drdr yir vnw neil pgf ein 

661 CATCAAGTCT CGGAGCAGCA ACATGCTGAC GCCCTATGAC TACTCCTCTG TGATGCACTA
fiks rss nml tpyd yss vmh 

721 TGGGAGGCTC GCCTTCAGCC GGCGTGGGCT GCCCACCATC ACACCACTTT GGGCCCCCAG 
ygrl afs rrg lpti tpl wap 

781 TGTCCACATC GGCCAGCGAT GGAACCTGAG TGCCTCGGAC ATCACCCGGG TCCTCAAACT 
svhi gqr wnl sasd itr vlk 

841 CTACGGCTGC AGCCCAAGTG GCCCCAGGCC CCGTGGGAGA GGGTCCCATG CCCACAGCAC 
lygc sps gpr prgr gsh ahs 

901 TGGTAGGAGC CCCGCTCCGG CCTCCCTATC TCTGCAGCGG CTTTTGGAGG CACTGTCGGC 
tgrs pap asl slqr lle als

961 GGAATCCAGG AGCCCCGACC CCAGTGGTTC CAGTGCGGGA GGCCAGCCCG TTCCTGCAGG 
aesr spd psg ssag gqp vpa 

1021 GCCTGGGGAG AGCCCACATG GGTGGGAGTC CCCTGCCCTG AAAAAGCTCA GTGCAGAGGC 
gpge sph gwe spal kkl sae 
1081 CTCGGCAAGG CAGCCTCAGA CCCTAGCTTC CTCCCCAAGA TCAAGGCCTG GAGCAGGTGC
asar qpq tla sspr srp gag

1141 CCCCGGTGTT GCTCAGGAGC AGTCCTGGCT GGCCGGAGTG TCCACCAAGC CCACAGTCCC 
apgv aqe qsw lagv stk ptv 

1201 ATCTTCAGAA GCAGGAATCC AGCCAGTCCC TGTCCAGGGA AGCCCAGCTC TGCCAGGGGG 
psse agi qpv pvqg spa lpg

1261 CTGTGTACCT AGAAATCATT TCAAGGGGAT GTCCGAAGAT TAAGCCTGTG GCT
gcvp rnh fkg mse d




78836-FL-R 
FIGURE 11

Query= INSP005b

(431 letters)

Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF

1,247,039 sequences; 397,579,747 total letters

Searching done

                                                                     Score   E
Sequences producing significant alignments:                          (bits)  Value

ref|XP_141346.1| similar to hatching enzyme EHE7 [Anguilla japon...   540    e-152
dbj|BAB68513.1| hatching enzyme EHE4 [Anguilla japonica]              198    1e-49
dbj|BAB68518.1| hatching enzyme EHE13 [Anguilla japonica]             198    1e-49
dbj|BAB68516.1| hatching enzyme EHE10 [Anguilla japonica]             197    3e-49
dbj|BAB68515.1| hatching enzyme EHE7 [Anguilla japonica]              196    4e-49
dbj|BAB68514.1| hatching enzyme EHE6 [Anguilla japonica]              196    7e-49
dbj|BAB68517.1| hatching enzyme EHE12 [Anguilla japonica]             194    3e-48
dbj|BAB68519.1| hatching enzyme EHE14 [Anguilla japonica]             191    1e-47
pir||C48826 high choriolytic hatching proteinase (EC 3.4.24.-) H...   187    3e-46
dbj|BAA12146.1| choriolysin H [Oryzias latipes]                       186    4e-46



Top alignment to known metalloproteinase:

>dbj|BAB68518.1| hatching enzyme EHE13 [Anguilla japonica]
Length = 271

Score = 198 bits (503), Expect = 1e-49

Identities = 103/233 (44%), Positives = 144/233 (61%), Gaps = 5/233 (2%)

Query: 52 DKDIPAINQGLILEETPESSFLIEGDIIRPSPFRLLSATSNK--WPMGGSGVVEVPFLLS 109

D D I ++ S L+EGD++ + ++ +N+ W G+VEVP+ +S 

Sbjct: 41 DPDDLDITARII^SNNGSSEILMEGDMWSOTRNAINCWNNQCLWRKSSDGLVEVPYTVS 100 

Query: 110 SKYDEPSRQVILEALAEFERSTCIRFVTYQDQRDFISIIPMYGCFSSVGRSGGMQVVSLA 169
           S++    ++ I  A+  F   TCIRFV    QRDFISI    GC+S +GR+GG QVVSLA
Sbjct: 101 SEFSYYHKKRIENAMETFNTETCIRFVPRSSQRDFISIESRDGCYSYLGRTGGKQVVSLA 160

Query: 170 PT-CLQKGRGIVLHELMHVLGFWHEHTRADRDRYIRVNWNEILPGFEINFIKSRSSNMLT 228
              C+    GI+ HEL H LGF+HEHTR+DRD Y+++NW  + P    NF +  ++N+ T
Sbjct: 161 RYGCVY--HGIIQHELNHALGFYHEHTRSDRDEYVKINWENVAPHTIYNFQEQDTNNLNT 218

Query: 229 PYDYSSVMHYGRLAFSRRGLPTITPLWAPSVHIGQRWNLSASDITRVLKLYGC 281
           PYDY+S+MHYGR AFS  G+ TITP+  P+  IGQR ++S  DI R+ KLY C
Sbjct: 219 PYDYTSIMHYGRTAFSTNGMDTITPVPNPNQSIGQRRSMSKGDILRINKLYSC 271
FIGURE 12

Molecule: pCR4 TOPO-IPAAA78836-2, 5269 bps DNA Circular

File Name: 13296.cm5, dated 24 Oct 2002

Description: Ligation of inverted IPAAA78836v2 into pCR4-TOPO linear vector

Molecule Features:

Type    Start  End   C  Name          Description
REGION  205    221      M13           rev priming site
MARKER  243             T3
REGION  262    294                    Polylinker
REGION  294    294                    TOPO cloning site
GENE    1600   307   C  IPAAA78836-2
REGION  1607   1624                   Polylinker
REGION  1607   1607                   TOPO cloning site
MARKER  1659         C  T7
REGION  1667   1682     M13
GENE    2471   3265     KanR
GENE    3469   4329     AmpR
REGION  4474   5147     pUC ori
FIGURE 13

Active site residues are highlighted in grey below.

WO2002/16566-A2          MEGVGGLWPWVLGLLSLPGVILGAPLASSCAGACGTSFPDGLTPEGTQASGDKDI
AX526191            MSCCLVSPVGAPGICVCPCLSGPGVILGAPLASSCAGACGTSFPDGLTPEGTQASGDKDI
INSP005 PREDICTION
INSP005b                 MEGVGGLWPWVLGLLSLPGVILGAPLASSCAGACGTSFPDGLTPEGTQASGDKDI
INSP005a

WO2002/16566-A2     PAINQGLILEETPESSFLIEGDIIRPSPFRLLSATSNKWPMGGSGVVEVPFLLSSKYDEP
AX526191            PAINQGLILEETPESSFLIEGDIIRPSPFRLLSATSNKWPMGGSGVVEVPFLLSSKYDEP
INSP005 PREDICTION                                        WPMGGSGVVEVPFLLSSKYDEP
INSP005b            PAINQGLILEETPESSFLIEGDIIRPSPFRLLSATSNKWPMGGSGVVEVPFLLSSKYDEP
INSP005a                                                    MGGSGVVEVPFLLSSKYDEP

WO2002/16566-A2     SRQVILEALAEFERSTCIRFVTYQDQRDFISIIPMYGCFSSVGRSGGMQVVSLAPTCLQK
AX526191            SRQVILEALAEFERSTCIRFVTYQDQRDFISIIPMYGCFSSVGRSGGMQVVSLAPTCLQK
INSP005 PREDICTION  SHQVILEALAEFERSTCIRFVTYQDQRDFISIIPMYGCFSSVGRSGGMQVVSLAPTCLQK
INSP005b            SRQVILEALAEFERSTCIRFVTYQDQRDFISIIPMYGCFSSVGRSGGMQVVSLAPTCLQK
INSP005a            SRQVILEALAEFERSTCIRFVTYQDQRDFISIIPMYGCFSSVGRSGGMQVVSLAPTCLQK
                    *.**********************************************************

WO2002/16566-A2     GRGIVLHELMHVLGFWHEHTRADRDRYIRVNWNEILPGFEINFIKSRSSNMLTPYDYSSV
AX526191            GRGIVLHELMHVLGFWHEHTRADRDRYIRVNWNEILPGFEINFIKSRSSNMLTPYDYSSV
INSP005 PREDICTION  GRGIVLHELMHVLGFWHEHTRADRDRYIRVNWNEILPGFEINFIKSQSSNMLTPYDYSSV
INSP005b            GRGIVLHELMHVLGFWHEHTRADRDRYIRVNWNEILPGFEINFIKSRSSNMLTPYDYSSV
INSP005a            GRGIVLHELMHVLGFWHEHTRADRDRYIRVNWNEILPGFEINFIKSQSSNMLTPYDYSSV
                    **********************************************:*************

WO2002/16566-A2     MHYGRLAFSRRGLPTITPLWAPSVHIGQRWNLSASDITRVLKLYGCSPSGPRPRGRG
AX526191            MHYGRLAFSRRGLPTITPLWAPSVHIGQRWNLSASDITRVLKLYGCSPSGPRPRGRGSHA
INSP005 PREDICTION  MHYGRLAFSRRGLPTITPLWAPSVHIGQRWNLSASDITRVLKLYGC
INSP005b            MHYGRLAFSRRGLPTITPLWAPSVHIGQRWNLSASDITRVLKLYGCSPSGPRPRGRGSHA
INSP005a            MHYGRLAFSRRGLPTITPLWAPSVHIGQRWNLSASDITRVLKLYGCSPSGPRPRGRGSHA
                    ********************************************** : . : : ;

WO2002/16566-A2     EWHG RKVT
AX526191            HSTGRSPAPASLSLQRLLEALSAESRSPDPSGSSAGGQPVPAGPGESPHGWESPALKKLS
INSP005 PREDICTION
INSP005b            HSTGRSPAPASLSLQRLLEALSAESRSPDPSGSSAGGQPVPAGPGESPHGWESPALKKLS
INSP005a            HSTGRSPAPASLSLQRLLEALSAESRSPDPSGSSAGGQPVPAGPGESPHGWESPALKKLS

WO2002/16566-A2
AX526191            AEASARQPQTIASSPRSRPGAGAPGVAQEQSWLAGVSTKPTVPSSEAGIQPVPVQGSPAL
INSP005 PREDICTION
INSP005b            AEASARQPQTLASSPRSRPGAGAPGVAQEQSWLAGVSTKPTVPSSEAGIQPVPVQGSPAL
INSP005a            AEASARQPQTLASSPRSRPGAGAPGVAQEQSWLAGVSTKPTVPSSEAGIQPVPVQGSPAL

WO2002/16566-A2
AX526191            PGGCVPRNHFKGMSED
INSP005 PREDICTION
INSP005b            PGGCVPRNHFKGMSED
INSP005a            PGGCVPRNHFKGMSED
FIGURE 14

>INSP005b
SignalP-NN result:

[Plot: SignalP-NN prediction (euk networks): INSP005b. C score, S score and Y score against sequence position.]

# data
>INSP005b                length = 70
# Measure  Position  Value  Cutoff  signal peptide?
  max. C   24        1.000  0.33    YES
  max. Y   24        0.783  0.32    YES
  max. S   13        0.991  0.82    YES
  mean S   1-23      0.929  0.47    YES
# Most likely cleavage site between pos. 23 and 24: ILG-AP

SignalP-HMM result:

[Plot: SignalP-HMM prediction (euk models): INSP005b. Cleavage prob., n-region prob., h-region prob. and c-region prob. against sequence position.]

# data
>INSP005b

Prediction: Signal peptide
Signal peptide probability: 0.996
Signal anchor probability: 0.003

Max cleavage site probability: 0.302 between pos. 23 and 24
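
The predicted cleavage between positions 23 and 24 can be applied to the sequence to separate the signal peptide from the remainder of the precursor. This is a minimal sketch: `split_signal_peptide` is a hypothetical helper, and the 70-residue string is the N-terminus of INSP005b as submitted to SignalP, read from the alignment in FIGURE 13 because the final residue is illegible under the plots.

```python
from typing import Tuple


def split_signal_peptide(protein: str, cleavage_after: int) -> Tuple[str, str]:
    """Split a precursor after the given 1-based position; returns (signal, remainder)."""
    if not 0 < cleavage_after < len(protein):
        raise ValueError("cleavage position must fall inside the sequence")
    return protein[:cleavage_after], protein[cleavage_after:]


# First 70 residues of INSP005b (the length reported in the SignalP-NN data block).
precursor = (
    "MEGVGGLWPWVLGLLSLPGVILGAPLASSCAGACGTSFPDGLTPEGTQASGDKDIPAINQGLILEETPES"
)

signal, remainder = split_signal_peptide(precursor, 23)
# SignalP prints the site as the last three signal residues, a dash, then the next two.
site = f"{signal[-3:]}-{remainder[:2]}"
print(signal)
print(site)
```

The printed site string can be compared directly with the "Most likely cleavage site" line of the SignalP-NN output.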
FIGURE 15A

10/539847

PCT/GB2003/005664

[Plot. Recoverable labels: y-axis ticks 1000 and 2000.]
FIGURE 15B

[Plot. Recoverable labels: ALAT (8h); pDEST; IPAAA78836-2; y-axis tick 2000.]

WO 2004/056983

FIGURE 16A

FIGURE 16B

[Plot. Recoverable label: TNF (1h30).]